We investigate the spambase.csv dataset from the OpenML-100 benchmark suite. The dataset concerns e-mails, of which some (~39%) were classified as spam, while the rest were work and personal e-mails. After dropping the first index column and verifying there are no NaN cells, we are left with a 4601x56 dataframe.
According to the dataset documentation (https://www.openml.org/search?type=data&sort=runs&id=44&status=active), the columns of the dataframe are the following:
- 48 continuous real [0,100] attributes of type word_freq_WORD = percentage of words in the e-mail that match WORD, i.e. 100 * (number of times the WORD appears in the e-mail) / total number of words in e-mail. A "word" in this case is any string of alphanumeric characters bounded by non-alphanumeric characters or end-of-string.
- 6 continuous real [0,100] attributes of type char_freq_CHAR = percentage of characters in the e-mail that match CHAR, i.e. 100 * (number of CHAR occurrences) / total characters in e-mail.
- 1 continuous real [1,...] attribute of type capital_run_length_average = average length of uninterrupted sequences of capital letters
- 1 nominal {0,1} class attribute of type spam = denotes whether the e-mail was considered spam (1) or not (0), i.e. unsolicited commercial e-mail.
Each word frequency is low on average (<1% of all words), but in some e-mails a single word can make up 40-50% of all words; the same holds for the specified characters. The mean of capital_run_length_average across e-mails was ~5 characters, while the median was ~2 and the maximum was >1100, indicating a strong right skew.
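The mean > median signature of right skew can be checked directly from summary statistics. A minimal sketch on a small synthetic series standing in for the capital_run_length_average column (the values below are illustrative, not the actual data):

```python
import pandas as pd

# Hypothetical stand-in for df['capital_run_length_average']:
# most runs are short, a few are very long (right-skewed)
runs = pd.Series([1.5, 2.0, 2.2, 2.3, 2.5, 3.0, 4.0, 6.0, 40.0, 1100.0])

mean, median, maximum = runs.mean(), runs.median(), runs.max()
print(mean, median, maximum)

# mean far above the median is the classic signature of right skew
assert mean > median
```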
After this initial investigation and cleaning of the data, we proceeded with training models to predict whether an e-mail is spam. We trained 3 models: Logistic Regression, Random Forest, and TabPFN.
We divided our dataset into 90% train and 10% eval subsets. During training we performed 5-fold cross-validation on the training set and then evaluated final performance on the eval set. Cross-validation guards against a potentially high-variance single train-test split, which could skew the results. The cross-validation train-test results shown later are the train and test scores averaged over the 5 folds. The completely separate eval set matters for the models we tuned: once hyperparameters are selected using cross-validation, we run the final model once on the eval set to obtain an unbiased performance estimate.
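The evaluation protocol described above can be sketched end-to-end on synthetic data (a stand-in for the spambase features; the actual training code appears further below):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_validate, train_test_split

# Synthetic stand-in for the spambase features/labels
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# 90/10 split: the 10% eval set is only touched once, at the very end
X_train, X_eval, y_train, y_eval = train_test_split(X, y, test_size=0.1, random_state=0)

# 5-fold CV on the training portion gives averaged train/test scores
scores = cross_validate(LogisticRegression(max_iter=1000), X_train, y_train,
                        cv=KFold(n_splits=5), scoring='accuracy', return_train_score=True)
print(np.mean(scores['train_score']), np.mean(scores['test_score']))

# Final unbiased estimate: fit on the full training set, score on eval
final = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(final.score(X_eval, y_eval))
```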
*Due to complexity limitations of TabPFN, we trained it on only 1000 rows of the (already shuffled) training data; larger amounts of data returned an error.
Below we present a table with the results of each model. Since this is a binary (0-1) classification task, we report accuracy. For the RF model, we show the best configuration found via cross-validation. The CV train and CV test columns show accuracy averaged over the 5 folds.
| Model | CV train accuracy | CV test accuracy | Eval accuracy |
|---|---|---|---|
| Logistic Regression | 0.927 | 0.922 | 0.941 |
| Random Forest* | 0.958 | 0.936 | 0.961 |
| TabPFN** | 0.987 | 0.929 | 0.967 |
*Hyperparameters selected: (n_estimators, max_depth, max_features) = (200, 8, 0.3)
**Trained only on 1000 rows of the dataframe
import numpy as np
import pandas as pd
spambase = pd.read_csv("spambase.csv")
spambase.head()
| Unnamed: 0 | word_freq_make | word_freq_address | word_freq_all | word_freq_3d | word_freq_our | word_freq_over | word_freq_remove | word_freq_internet | word_freq_order | ... | word_freq_table | word_freq_conference | char_freq_%3B | char_freq_%28 | char_freq_%5B | char_freq_%21 | char_freq_%24 | char_freq_%23 | capital_run_length_average | TARGET | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.00 | 0.64 | 0.64 | 0.0 | 0.32 | 0.00 | 0.00 | 0.00 | 0.00 | ... | 0.0 | 0.0 | 0.00 | 0.000 | 0.0 | 0.778 | 0.000 | 0.000 | 3.756 | 1 |
| 1 | 1 | 0.21 | 0.28 | 0.50 | 0.0 | 0.14 | 0.28 | 0.21 | 0.07 | 0.00 | ... | 0.0 | 0.0 | 0.00 | 0.132 | 0.0 | 0.372 | 0.180 | 0.048 | 5.114 | 1 |
| 2 | 2 | 0.06 | 0.00 | 0.71 | 0.0 | 1.23 | 0.19 | 0.19 | 0.12 | 0.64 | ... | 0.0 | 0.0 | 0.01 | 0.143 | 0.0 | 0.276 | 0.184 | 0.010 | 9.821 | 1 |
| 3 | 3 | 0.00 | 0.00 | 0.00 | 0.0 | 0.63 | 0.00 | 0.31 | 0.63 | 0.31 | ... | 0.0 | 0.0 | 0.00 | 0.137 | 0.0 | 0.137 | 0.000 | 0.000 | 3.537 | 1 |
| 4 | 4 | 0.00 | 0.00 | 0.00 | 0.0 | 0.63 | 0.00 | 0.31 | 0.63 | 0.31 | ... | 0.0 | 0.0 | 0.00 | 0.135 | 0.0 | 0.135 | 0.000 | 0.000 | 3.537 | 1 |
5 rows × 57 columns
spambase.isna().sum().sum()
0
df = spambase.drop(spambase.columns[0], axis=1)  # drop the first column, which is just an index
df
| word_freq_make | word_freq_address | word_freq_all | word_freq_3d | word_freq_our | word_freq_over | word_freq_remove | word_freq_internet | word_freq_order | word_freq_mail | ... | word_freq_table | word_freq_conference | char_freq_%3B | char_freq_%28 | char_freq_%5B | char_freq_%21 | char_freq_%24 | char_freq_%23 | capital_run_length_average | TARGET | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00 | 0.64 | 0.64 | 0.0 | 0.32 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... | 0.0 | 0.0 | 0.000 | 0.000 | 0.0 | 0.778 | 0.000 | 0.000 | 3.756 | 1 |
| 1 | 0.21 | 0.28 | 0.50 | 0.0 | 0.14 | 0.28 | 0.21 | 0.07 | 0.00 | 0.94 | ... | 0.0 | 0.0 | 0.000 | 0.132 | 0.0 | 0.372 | 0.180 | 0.048 | 5.114 | 1 |
| 2 | 0.06 | 0.00 | 0.71 | 0.0 | 1.23 | 0.19 | 0.19 | 0.12 | 0.64 | 0.25 | ... | 0.0 | 0.0 | 0.010 | 0.143 | 0.0 | 0.276 | 0.184 | 0.010 | 9.821 | 1 |
| 3 | 0.00 | 0.00 | 0.00 | 0.0 | 0.63 | 0.00 | 0.31 | 0.63 | 0.31 | 0.63 | ... | 0.0 | 0.0 | 0.000 | 0.137 | 0.0 | 0.137 | 0.000 | 0.000 | 3.537 | 1 |
| 4 | 0.00 | 0.00 | 0.00 | 0.0 | 0.63 | 0.00 | 0.31 | 0.63 | 0.31 | 0.63 | ... | 0.0 | 0.0 | 0.000 | 0.135 | 0.0 | 0.135 | 0.000 | 0.000 | 3.537 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4596 | 0.31 | 0.00 | 0.62 | 0.0 | 0.00 | 0.31 | 0.00 | 0.00 | 0.00 | 0.00 | ... | 0.0 | 0.0 | 0.000 | 0.232 | 0.0 | 0.000 | 0.000 | 0.000 | 1.142 | 0 |
| 4597 | 0.00 | 0.00 | 0.00 | 0.0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... | 0.0 | 0.0 | 0.000 | 0.000 | 0.0 | 0.353 | 0.000 | 0.000 | 1.555 | 0 |
| 4598 | 0.30 | 0.00 | 0.30 | 0.0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... | 0.0 | 0.0 | 0.102 | 0.718 | 0.0 | 0.000 | 0.000 | 0.000 | 1.404 | 0 |
| 4599 | 0.96 | 0.00 | 0.00 | 0.0 | 0.32 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... | 0.0 | 0.0 | 0.000 | 0.057 | 0.0 | 0.000 | 0.000 | 0.000 | 1.147 | 0 |
| 4600 | 0.00 | 0.00 | 0.65 | 0.0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... | 0.0 | 0.0 | 0.000 | 0.000 | 0.0 | 0.125 | 0.000 | 0.000 | 1.250 | 0 |
4601 rows × 56 columns
df.describe()
| word_freq_make | word_freq_address | word_freq_all | word_freq_3d | word_freq_our | word_freq_over | word_freq_remove | word_freq_internet | word_freq_order | word_freq_mail | ... | word_freq_table | word_freq_conference | char_freq_%3B | char_freq_%28 | char_freq_%5B | char_freq_%21 | char_freq_%24 | char_freq_%23 | capital_run_length_average | TARGET | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | ... | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 | 4601.000000 |
| mean | 0.104553 | 0.213015 | 0.280656 | 0.065425 | 0.312223 | 0.095901 | 0.114208 | 0.105295 | 0.090067 | 0.239413 | ... | 0.005444 | 0.031869 | 0.038575 | 0.139030 | 0.016976 | 0.269071 | 0.075811 | 0.044238 | 5.191515 | 0.394045 |
| std | 0.305358 | 1.290575 | 0.504143 | 1.395151 | 0.672513 | 0.273824 | 0.391441 | 0.401071 | 0.278616 | 0.644755 | ... | 0.076274 | 0.285735 | 0.243471 | 0.270355 | 0.109394 | 0.815672 | 0.245882 | 0.429342 | 31.729449 | 0.488698 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 |
| 25% | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.588000 | 0.000000 |
| 50% | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.065000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 2.276000 | 0.000000 |
| 75% | 0.000000 | 0.000000 | 0.420000 | 0.000000 | 0.380000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.160000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.188000 | 0.000000 | 0.315000 | 0.052000 | 0.000000 | 3.706000 | 1.000000 |
| max | 4.540000 | 14.280000 | 5.100000 | 42.810000 | 10.000000 | 5.880000 | 7.270000 | 11.110000 | 5.260000 | 18.180000 | ... | 2.170000 | 10.000000 | 4.385000 | 9.752000 | 4.081000 | 32.478000 | 6.003000 | 19.829000 | 1102.500000 | 1.000000 |
8 rows × 56 columns
X = df.loc[:, df.columns != 'TARGET']
X.head()
| word_freq_make | word_freq_address | word_freq_all | word_freq_3d | word_freq_our | word_freq_over | word_freq_remove | word_freq_internet | word_freq_order | word_freq_mail | ... | word_freq_edu | word_freq_table | word_freq_conference | char_freq_%3B | char_freq_%28 | char_freq_%5B | char_freq_%21 | char_freq_%24 | char_freq_%23 | capital_run_length_average | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00 | 0.64 | 0.64 | 0.0 | 0.32 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | ... | 0.00 | 0.0 | 0.0 | 0.00 | 0.000 | 0.0 | 0.778 | 0.000 | 0.000 | 3.756 |
| 1 | 0.21 | 0.28 | 0.50 | 0.0 | 0.14 | 0.28 | 0.21 | 0.07 | 0.00 | 0.94 | ... | 0.00 | 0.0 | 0.0 | 0.00 | 0.132 | 0.0 | 0.372 | 0.180 | 0.048 | 5.114 |
| 2 | 0.06 | 0.00 | 0.71 | 0.0 | 1.23 | 0.19 | 0.19 | 0.12 | 0.64 | 0.25 | ... | 0.06 | 0.0 | 0.0 | 0.01 | 0.143 | 0.0 | 0.276 | 0.184 | 0.010 | 9.821 |
| 3 | 0.00 | 0.00 | 0.00 | 0.0 | 0.63 | 0.00 | 0.31 | 0.63 | 0.31 | 0.63 | ... | 0.00 | 0.0 | 0.0 | 0.00 | 0.137 | 0.0 | 0.137 | 0.000 | 0.000 | 3.537 |
| 4 | 0.00 | 0.00 | 0.00 | 0.0 | 0.63 | 0.00 | 0.31 | 0.63 | 0.31 | 0.63 | ... | 0.00 | 0.0 | 0.0 | 0.00 | 0.135 | 0.0 | 0.135 | 0.000 | 0.000 | 3.537 |
5 rows × 55 columns
y = df.loc[:, df.columns == 'TARGET']
y.head()
| TARGET | |
|---|---|
| 0 | 1 |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=2)
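Since the classes are mildly imbalanced (~39% spam), passing stratify=y would guarantee the same class ratio in both subsets. A hedged alternative sketch on synthetic data (this is not what was run above, just an option):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data mimicking the ~61/39 class split
X, y = make_classification(n_samples=1000, weights=[0.61, 0.39], random_state=2)

# stratify=y keeps the spam/non-spam ratio near-identical in train and test
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, stratify=y, random_state=2)
print(y_tr.mean(), y_te.mean())
```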
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_validate
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
kf = KFold(n_splits = 5)
clf = LogisticRegression(random_state=2)
clf_scores = cross_validate(clf, X_train, y_train, cv=kf, n_jobs=-1, scoring='accuracy', return_train_score=True)
print("Accuracy: Train: ", np.mean(np.array(clf_scores['train_score'])), " Test: ", np.mean(np.array(clf_scores['test_score'])))
Accuracy: Train: 0.9268115942028985 Test: 0.9219806763285024
clf_final = LogisticRegression(random_state=2).fit(X_train, y_train)
print("Eval accuracy: ", accuracy_score(y_test, clf_final.predict(X_test)))
Eval accuracy: 0.9414316702819957
C:\Users\Antek\anaconda3\lib\site-packages\sklearn\utils\validation.py:1111: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
C:\Users\Antek\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
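The ConvergenceWarning above is typical when lbfgs runs on unscaled features; wrapping the estimator in a pipeline with StandardScaler (or raising max_iter) usually resolves it. A sketch with synthetic data standing in for spambase:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the spambase features/labels
X, y = make_classification(n_samples=500, n_features=20, random_state=2)

# Scaling puts all features on a comparable range, which lets
# lbfgs converge in far fewer iterations than on raw frequencies
pipe = make_pipeline(StandardScaler(), LogisticRegression(random_state=2, max_iter=1000))
pipe.fit(X, y)
print(pipe.score(X, y))
```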
from sklearn.ensemble import RandomForestClassifier
modelsRF = []
for n_estim in [50, 100, 200]:
for max_dep in [2, 5, 8]:
for max_feat in [0.1, 0.3, 0.5, 0.8]:
print("Training model with (n_estimators, max_depth, max_features): ", (n_estim, max_dep, max_feat))
curr_regr = RandomForestClassifier(n_estimators=n_estim, max_depth = max_dep, max_features = max_feat, random_state = 1)
curr_scores = cross_validate(curr_regr, X_train, y_train, cv=kf, n_jobs=-1, scoring='accuracy', return_train_score=True)
modelsRF.append((curr_regr, n_estim, max_dep, max_feat, curr_scores))
print("Accuracy: Train: ", np.mean(np.array(curr_scores['train_score'])), " Test: ", np.mean(np.array(curr_scores['test_score'])))
Training model with (n_estimators, max_depth, max_features): (50, 2, 0.1) Accuracy: Train: 0.8749396135265701 Test: 0.870048309178744
Training model with (n_estimators, max_depth, max_features): (50, 2, 0.3) Accuracy: Train: 0.8948671497584542 Test: 0.8896135265700483
Training model with (n_estimators, max_depth, max_features): (50, 2, 0.5) Accuracy: Train: 0.8936594202898551 Test: 0.8874396135265702
Training model with (n_estimators, max_depth, max_features): (50, 2, 0.8) Accuracy: Train: 0.879951690821256 Test: 0.8739130434782609
Training model with (n_estimators, max_depth, max_features): (50, 5, 0.1) Accuracy: Train: 0.9237318840579711 Test: 0.9171497584541063
Training model with (n_estimators, max_depth, max_features): (50, 5, 0.3) Accuracy: Train: 0.9337560386473431 Test: 0.923913043478261
Training model with (n_estimators, max_depth, max_features): (50, 5, 0.5) Accuracy: Train: 0.933695652173913 Test: 0.9214975845410628
Training model with (n_estimators, max_depth, max_features): (50, 5, 0.8) Accuracy: Train: 0.9327294685990338 Test: 0.9193236714975844
Training model with (n_estimators, max_depth, max_features): (50, 8, 0.1) Accuracy: Train: 0.9503623188405796 Test: 0.9301932367149759
Training model with (n_estimators, max_depth, max_features): (50, 8, 0.3) Accuracy: Train: 0.9569444444444445 Test: 0.932608695652174
Training model with (n_estimators, max_depth, max_features): (50, 8, 0.5) Accuracy: Train: 0.957669082125604 Test: 0.9304347826086957
Training model with (n_estimators, max_depth, max_features): (50, 8, 0.8) Accuracy: Train: 0.9573067632850242 Test: 0.9277777777777778
Training model with (n_estimators, max_depth, max_features): (100, 2, 0.1) Accuracy: Train: 0.8795893719806763 Test: 0.8760869565217393
Training model with (n_estimators, max_depth, max_features): (100, 2, 0.3) Accuracy: Train: 0.8993961352657006 Test: 0.8946859903381643
Training model with (n_estimators, max_depth, max_features): (100, 2, 0.5) Accuracy: Train: 0.8975845410628018 Test: 0.8905797101449275
Training model with (n_estimators, max_depth, max_features): (100, 2, 0.8) Accuracy: Train: 0.8850845410628019 Test: 0.8789855072463768
Training model with (n_estimators, max_depth, max_features): (100, 5, 0.1) Accuracy: Train: 0.9263888888888889 Test: 0.9169082125603865
Training model with (n_estimators, max_depth, max_features): (100, 5, 0.3) Accuracy: Train: 0.9347826086956521 Test: 0.9234299516908212
Training model with (n_estimators, max_depth, max_features): (100, 5, 0.5) Accuracy: Train: 0.932548309178744 Test: 0.9229468599033815
Training model with (n_estimators, max_depth, max_features): (100, 5, 0.8) Accuracy: Train: 0.932487922705314 Test: 0.920048309178744
Training model with (n_estimators, max_depth, max_features): (100, 8, 0.1) Accuracy: Train: 0.9515700483091788 Test: 0.9318840579710145
Training model with (n_estimators, max_depth, max_features): (100, 8, 0.3) Accuracy: Train: 0.9583333333333333 Test: 0.9340579710144926
Training model with (n_estimators, max_depth, max_features): (100, 8, 0.5) Accuracy: Train: 0.9585144927536232 Test: 0.9318840579710145
Training model with (n_estimators, max_depth, max_features): (100, 8, 0.8) Accuracy: Train: 0.9582125603864735 Test: 0.9292270531400966
Training model with (n_estimators, max_depth, max_features): (200, 2, 0.1) Accuracy: Train: 0.8794082125603865 Test: 0.8768115942028987
Training model with (n_estimators, max_depth, max_features): (200, 2, 0.3) Accuracy: Train: 0.8977657004830919 Test: 0.8939613526570047
Training model with (n_estimators, max_depth, max_features): (200, 2, 0.5) Accuracy: Train: 0.9026570048309178 Test: 0.8958937198067634
Training model with (n_estimators, max_depth, max_features): (200, 2, 0.8) Accuracy: Train: 0.883816425120773 Test: 0.8801932367149758
Training model with (n_estimators, max_depth, max_features): (200, 5, 0.1) Accuracy: Train: 0.9255434782608696 Test: 0.917391304347826
Training model with (n_estimators, max_depth, max_features): (200, 5, 0.3) Accuracy: Train: 0.9349033816425122 Test: 0.9251207729468598
Training model with (n_estimators, max_depth, max_features): (200, 5, 0.5) Accuracy: Train: 0.9331521739130435 Test: 0.9219806763285024
Training model with (n_estimators, max_depth, max_features): (200, 5, 0.8) Accuracy: Train: 0.9329106280193237 Test: 0.9205314009661836
Training model with (n_estimators, max_depth, max_features): (200, 8, 0.1) Accuracy: Train: 0.9518719806763285 Test: 0.9314009661835747
Training model with (n_estimators, max_depth, max_features): (200, 8, 0.3) Accuracy: Train: 0.9581521739130435 Test: 0.9355072463768115
Training model with (n_estimators, max_depth, max_features): (200, 8, 0.5) Accuracy: Train: 0.9588164251207729 Test: 0.9316425120772948
Training model with (n_estimators, max_depth, max_features): (200, 8, 0.8) Accuracy: Train: 0.9589371980676329 Test: 0.9323671497584541
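The manual triple loop above can equivalently be expressed with GridSearchCV, which runs the same cross-validated search in parallel and keeps the best estimator. A sketch on synthetic data with a trimmed version of the grid used above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in for the spambase training data
X, y = make_classification(n_samples=300, n_features=20, random_state=1)

param_grid = {
    'n_estimators': [50, 100],   # trimmed grid for the sketch
    'max_depth': [2, 8],
    'max_features': [0.3, 0.5],
}
search = GridSearchCV(RandomForestClassifier(random_state=1), param_grid,
                      cv=KFold(n_splits=5), scoring='accuracy', n_jobs=-1)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```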
# Collect one row per (configuration, score type, fold); building a list of
# rows and constructing the DataFrame once avoids the deprecated
# DataFrame.append (removed in pandas 2.0)
results_rows = []
for m in modelsRF:
    _, n_estim, max_dep, max_feat, scores = m
    for sc_type in ['test_score', 'train_score']:
        for k in range(5):
            results_rows.append([n_estim, max_dep, max_feat, sc_type, scores[sc_type][k]])
results_chart_RF = pd.DataFrame(results_rows, columns=['n_estimators', 'max_depth', 'max_features', 'score_type', 'Accuracy'])
import plotly.express as px
px.box(results_chart_RF, y='Accuracy', color='score_type', facet_col='max_features', x='max_depth', facet_row='n_estimators')
RF_final = RandomForestClassifier(n_estimators=200, max_depth = 8, max_features = 0.3, random_state = 1).fit(X_train, y_train)
print("Eval accuracy: ", accuracy_score(y_test, RF_final.predict(X_test)))
C:\Users\Antek\AppData\Local\Temp\ipykernel_10616\4069508598.py:1: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
Eval accuracy: 0.9609544468546638
from tabpfn import TabPFNClassifier
tabpfn = TabPFNClassifier(device='cpu', N_ensemble_configurations=32)
tabpfn_scores = cross_validate(tabpfn, X_train[:1000], y_train[:1000], cv=kf, n_jobs=-1, scoring='accuracy', return_train_score=True)
print("Accuracy: Train: ", np.mean(np.array(tabpfn_scores['train_score'])), " Test: ", np.mean(np.array(tabpfn_scores['test_score'])))
Loading model that can be used for inference only Using a Transformer with 25.82 M parameters
Accuracy: Train: 0.98675 Test: 0.9289999999999999
tabpfn_final = TabPFNClassifier(device='cpu', N_ensemble_configurations=32).fit(X_train[:1000], y_train[:1000])
print("Eval accuracy: ", accuracy_score(y_test, tabpfn_final.predict(X_test)))
Loading model that can be used for inference only Using a Transformer with 25.82 M parameters
C:\Users\Antek\anaconda3\lib\site-packages\sklearn\utils\validation.py:1111: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
Eval accuracy: 0.9674620390455532
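Note that X_train[:1000] takes the first 1000 rows of the already-shuffled training set, which is effectively a random sample. An explicit, reproducible random sample (a hypothetical alternative, shown here on a synthetic dataframe) would be:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the training dataframe and labels
X_train = pd.DataFrame(np.arange(5000).reshape(2500, 2), columns=['a', 'b'])
y_train = pd.Series(np.random.RandomState(0).randint(0, 2, 2500))

# .sample draws 1000 rows uniformly at random; reuse the index for labels
X_sub = X_train.sample(n=1000, random_state=2)
y_sub = y_train.loc[X_sub.index]
print(X_sub.shape, y_sub.shape)
```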